Bioinformatics of the Brain (Kayhan Erciyes, Tuba Sevimoğlu)

234

Bioinformatics of the Brain

9.4.1.3

Modularity Measures

Modularity parameter provides a measure of goodness of the discovered mod-

ules in a network. Given a graph G = (V, E), percentage of edges in a module

Mi can be stated in Eqn. 9.9, giving the percentage of edges inside a module

eii = |{(uv) : u ∈Vi, v ∈Vi, (u, v) ∈E|/|E|

(9.9)

Let ai be the percentage of edges with at least one end in Mi specified as

in Eqn. 9.10:

ai = |{(uv) : u ∈Vi, v ∈V \ Vi, (u, v) ∈E|/|E|

(9.10)

Modularity of a complex network with k modules can now be defined as

in Eqn. 9.11, providing the quality of all modules discovered in the network.

In other words, we evaluate the ratio of edges totally contained in modules to

the edges that have one end in a module.

Q =

i=1

(eii −a²

i ⁾

(9.11)

Based on the modularity parameter, modularity-based clustering algo-

rithm proposed by Newman has the following steps. This algorithm with time

complexity O(m + n)n) is classified as agglomerative hierarchical clustering

algorithm as it iteratively forms larger clusters.

Each node v ∈V is a cluster initially.

Repeat

•Merge two clusters Ci and Cj that will increase modularity M

by the largest amount.

Until merging any two clusters does not improve M.

9.4.1.4

Spectral Clustering

This method of graph clustering uses algebraic properties of the graph that

represents the complex network. The Laplacian matrix of a graph G = (V, E)

is L = D −A where D is the degree matrix with di ∈D is a diagonal

element representing degree of vertex i and A is the adjacency matrix. In

normalized form, L = I −D¹^/²AD−1/2. The eigenvalues of L are real and

symmetric and the second eigenvalue of L is called the Fiedler value and the

corresponding eigenvector is denoted as Fiedler vector. The spectral bisection

procedure using Fiedler vector is defined as follows. Having constructed the

Fiedler vector, each entry is compared with a threshold value, commonly 0

or the median value of the Fiedler vector, and a vertex that has a smaller or

equal value to the threshold is placed in cluster 1; any vertex with a larger

value is stored in cluster 2. This algorithm may be implemented recursively

to discover k clusters.